What Is Claimed Is: 

1. A method for generating code to perform anticipatory prefetching
for data references, comprising: 

receiving code to be executed on a computer system; 
analyzing the code to identify data references to be prefetched; and 

inserting prefetch instructions into the code in advance of the identified 
data references, wherein inserting the prefetch instructions involves, 

attempting to calculate a stride value for a given data 
reference within a loop, 

if the stride value cannot be calculated, setting the stride 
value to a default stride value, and 
inserting a prefetch instruction to prefetch the given data reference for a 
subsequent loop iteration based on the stride value. 
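
The fallback described in claim 1 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the names `choose_stride`, `next_iteration_prefetch_address`, and `DEFAULT_STRIDE_BYTES`, and the value of 64 bytes, are hypothetical, and a failed stride calculation is modeled as `None`.

```python
from typing import Optional

# Hypothetical default stride in bytes (assumption; the claims give no value).
DEFAULT_STRIDE_BYTES = 64


def choose_stride(computed_stride: Optional[int],
                  default_stride: int = DEFAULT_STRIDE_BYTES) -> int:
    """Return the computed stride, or the default when none could be calculated."""
    if computed_stride is None:
        return default_stride
    return computed_stride


def next_iteration_prefetch_address(current_address: int, stride: int) -> int:
    """Address the same data reference is expected to touch one iteration later."""
    return current_address + stride


# Stride analysis failed, so the default stride is used.
stride = choose_stride(None)
target = next_iteration_prefetch_address(0x1000, stride)
```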

2. The method of claim 1, further comprising allowing a system user
to specify the default stride value. 

3. The method of claim 1, wherein calculating the stride value 
involves: 

identifying an induction variable for the stride value; 
identifying a stride function for the stride value; and 
calculating the stride value based upon the stride function and the 
induction variable. 
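
One way to picture the steps of claim 3 is sketched below. It assumes the stride function can be modeled as a callable that maps an induction-variable value to an address. A compiler would normally derive the stride symbolically; this sketch instead samples the function at consecutive induction values, which is a simplification. The names `calculate_stride` and `address_of` are hypothetical.

```python
from typing import Callable, Optional


def calculate_stride(address_of: Callable[[int], int],
                     start: int,
                     step: int,
                     samples: int = 4) -> Optional[int]:
    """Return the constant address delta per iteration, or None if it varies.

    address_of: stand-in for the stride function (induction value -> address).
    start, step: initial value and per-iteration increment of the induction variable.
    """
    values = [start + k * step for k in range(samples + 1)]
    deltas = {address_of(values[k + 1]) - address_of(values[k])
              for k in range(samples)}
    # A single distinct delta means the stride is constant over the samples.
    return deltas.pop() if len(deltas) == 1 else None


# Affine access a[2*i + 1] with 8-byte elements: constant stride of 16 bytes.
affine = calculate_stride(lambda i: 0x1000 + 8 * (2 * i + 1), start=0, step=1)

# Quadratic access a[i*i]: no constant stride, so the calculation fails.
quadratic = calculate_stride(lambda i: 0x1000 + 8 * i * i, start=0, step=1)
```

A `None` result corresponds to the case in claim 1 where the stride value cannot be calculated and a default is used instead.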

4. The method of claim 1, wherein inserting the prefetch instruction
based on the stride value involves: 

calculating a prefetch cover distance by dividing a cache line size by the 
stride value;

calculating a prefetch ahead distance as a function of a prefetch latency, 
the prefetch cover distance and an execution time of a loop; and 

calculating a prefetch address by multiplying the stride value by the 
prefetch cover distance and the prefetch ahead distance and adding the result to an 
address accessed by the given data reference. 
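
The arithmetic in claim 4 can be worked through as below. The cover distance and the prefetch address follow the claim directly. The claim only says the ahead distance is "a function of" the latency, the cover distance, and the loop execution time, so the formula in `prefetch_ahead_distance` is an assumption chosen for illustration, as is clamping the cover distance to at least 1. All function names and sample numbers are hypothetical.

```python
import math


def prefetch_cover_distance(cache_line_size: int, stride: int) -> int:
    """Iterations covered by one cache line: cache line size divided by stride."""
    if stride <= 0:
        raise ValueError("this sketch assumes a positive stride in bytes")
    # Clamp to 1 (assumption) so strides larger than a line still make progress.
    return max(1, cache_line_size // stride)


def prefetch_ahead_distance(latency: int, cover: int, iteration_time: int) -> int:
    """Assumed form: 1 + ceil(latency / (cover * iteration_time))."""
    return 1 + math.ceil(latency / (cover * iteration_time))


def prefetch_address(address: int, stride: int, cover: int, ahead: int) -> int:
    """Address plus stride * cover distance * ahead distance, as in the claim."""
    return address + stride * cover * ahead


# Sample numbers: 64-byte line, 8-byte stride, 300-cycle latency, 10-cycle loop body.
cover = prefetch_cover_distance(64, 8)          # 8 iterations per cache line
ahead = prefetch_ahead_distance(300, cover, 10)  # 1 + ceil(300 / 80) = 5
address = prefetch_address(0x1000, 8, cover, ahead)
```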

5. The method of claim 1, wherein analyzing the code involves: 
identifying loop bodies within the code; and 

identifying data references to be prefetched from within the loop bodies. 

6. The method of claim 5, wherein analyzing the code to identify data 
references to be prefetched involves examining a pattern of data references over 
multiple loop iterations. 

7. The method of claim 1, wherein analyzing the code involves 
analyzing the code within a compiler. 

8. A computer-readable storage medium storing instructions that 
when executed by a computer cause the computer to perform a method for 
generating code to perform anticipatory prefetching for data references, the 
method comprising: 

receiving code to be executed on a computer system; 
analyzing the code to identify data references to be prefetched; and 

inserting prefetch instructions into the code in advance of the identified 
data references, wherein inserting the prefetch instructions involves, 

attempting to calculate a stride value for a given data 
reference within a loop, 
if the stride value cannot be calculated, setting the stride
value to a default stride value, and 

inserting a prefetch instruction to prefetch the given data 
reference for a subsequent loop iteration based on the stride value. 

9. The computer-readable storage medium of claim 8, wherein the 
method further comprises allowing a system user to specify the default stride 
value. 

10. The computer-readable storage medium of claim 8, wherein 
calculating the stride value involves: 

identifying an induction variable for the stride value; 
identifying a stride function for the stride value; and 
calculating the stride value based upon the stride function and the 
induction variable. 

11. The computer-readable storage medium of claim 8, wherein 
inserting the prefetch instruction based on the stride value involves: 

calculating a prefetch cover distance by dividing a cache line size by the 
stride value; 

calculating a prefetch ahead distance as a function of a prefetch latency, 
the prefetch cover distance and an execution time of a loop; and 

calculating a prefetch address by multiplying the stride value by the 
prefetch cover distance and the prefetch ahead distance and adding the result to an 
address accessed by the given data reference. 

12. The computer-readable storage medium of claim 8, wherein 
analyzing the code involves analyzing the code within a compiler. 

13. An apparatus that generates code to perform anticipatory
prefetching for data references, comprising: 

a receiving mechanism that is configured to receive code to be executed 
on a computer system; 

an analysis mechanism that is configured to analyze the code to identify 
data references to be prefetched; and 

an insertion mechanism that is configured to insert prefetch instructions 
into the code in advance of the identified data references; 

wherein the insertion mechanism is configured to, 

attempt to calculate a stride value for a given data reference 
within a loop, 

set the stride value to a default stride value if the stride 
value cannot be calculated, and to 

insert a prefetch instruction to prefetch the given data 
reference for a subsequent loop iteration based on the stride value. 

14. The apparatus of claim 13, further comprising a configuration 
mechanism that is configured to receive the default stride value from a system 
user. 

15. The apparatus of claim 13, wherein while calculating the stride 
value, the insertion mechanism is configured to: 

identify an induction variable for the stride value; 
identify a stride function for the stride value; and to 

calculate the stride value based upon the stride function and the induction 
variable. 

16. The apparatus of claim 13, wherein the insertion mechanism is
configured to: 

calculate a prefetch cover distance by dividing a cache line size by the 
stride value; 

calculate a prefetch ahead distance as a function of a prefetch latency, the 
prefetch cover distance and an execution time of a loop; and to 

calculate a prefetch address by multiplying the stride value by the prefetch 
cover distance and the prefetch ahead distance and adding the result to an address 
accessed by the given data reference. 

17. The apparatus of claim 13, wherein the apparatus resides within a
compiler. 

18. A method for generating code to perform anticipatory prefetching 
for data references, comprising: 

receiving code to be executed on a computer system; 

analyzing the code to identify data references to be prefetched; and 

inserting prefetch instructions into the code in advance of the identified
data references so that multiple prefetch instructions are issued for a given data
reference;

whereby the given data reference is prefetched even if the computer 
system drops a prefetch instruction for the given data reference. 

19. The method of claim 18, wherein inserting prefetch instructions 
involves ensuring that the multiple prefetch instructions for the given data 
reference are issued at different times, so that a single event is unlikely to cause 
all of the multiple prefetch instructions for the given data reference to be dropped 
by the computer system. 

20. The method of claim 18, wherein inserting prefetch instructions
involves issuing each of the multiple prefetch instructions for the given data 
reference in a different loop iteration. 
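
The redundancy described in claims 18 through 20 can be illustrated with a small model. It assumes each loop iteration issues one prefetch per entry in a list of look-ahead distances, so that a given target iteration is prefetched from several earlier iterations. The function names and the distances 4 and 8 are hypothetical, chosen only to show the effect.

```python
from typing import Dict, Iterable, List, Set


def redundant_prefetch_schedule(num_iterations: int,
                                distances: Iterable[int]) -> Dict[int, List[int]]:
    """Map each target iteration to the iterations that issue a prefetch for it."""
    distances = list(distances)
    schedule: Dict[int, List[int]] = {}
    for i in range(num_iterations):
        for d in distances:
            target = i + d
            if target < num_iterations:
                schedule.setdefault(target, []).append(i)
    return schedule


def still_prefetched(schedule: Dict[int, List[int]],
                     target: int,
                     dropped_iterations: Set[int]) -> bool:
    """True if at least one prefetch for the target survives the dropped iterations."""
    issuers = schedule.get(target, [])
    return any(i not in dropped_iterations for i in issuers)


schedule = redundant_prefetch_schedule(16, distances=(4, 8))

# Data used in iteration 10 is prefetched in iterations 2 and 6, so losing the
# prefetch issued in iteration 6 still leaves the one issued in iteration 2.
survives = still_prefetched(schedule, target=10, dropped_iterations={6})
```

Early iterations receive fewer prefetches (or none) in this model because there is no earlier iteration to issue them.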



21. The method of claim 18, wherein analyzing the code involves:
identifying loop bodies within the code; and 

identifying data references to be prefetched from within the loop bodies. 

22. The method of claim 21, wherein analyzing the code to identify
data references to be prefetched involves examining a pattern of data references 
over multiple loop iterations. 

23. The method of claim 18, wherein analyzing the code involves
analyzing the code within a compiler. 

24. A computer-readable storage medium storing instructions that 
when executed by a computer system cause the computer system to perform a 
method for generating code to perform anticipatory prefetching for data 
references, the method comprising: 

receiving code to be executed on the computer system; 

analyzing the code to identify data references to be prefetched; and 

inserting prefetch instructions into the code in advance of the identified
data references so that multiple prefetch instructions are issued for a given data
reference;

whereby the given data reference is prefetched even if the computer 
system drops a prefetch instruction for the given data reference. 

25. The computer-readable storage medium of claim 24, wherein
inserting prefetch instructions involves ensuring that the multiple prefetch 
instructions for the given data reference are issued at different times, so that a 
single event is unlikely to cause all of the multiple prefetch instructions for the 
given data reference to be dropped by the computer system. 

26. The computer-readable storage medium of claim 24, wherein 
inserting prefetch instructions involves issuing each of the multiple prefetch 
instructions for the given data reference in a different loop iteration. 

27. The computer-readable storage medium of claim 24, wherein 
analyzing the code involves analyzing the code within a compiler. 

28. An apparatus that generates code to perform anticipatory 
prefetching for data references, comprising: 

a receiving mechanism that is configured to receive code to be executed 
on a computer system; 

an analysis mechanism that is configured to analyze the code to identify 
data references to be prefetched; and 

an insertion mechanism that is configured to insert prefetch instructions 
into the code in advance of the identified data references so that multiple prefetch 
instructions are issued for a given data reference; 

whereby the given data reference is prefetched even if the computer 
system drops a prefetch instruction for the given data reference. 



29. The apparatus of claim 28, wherein the insertion mechanism is 
configured to ensure that the multiple prefetch instructions for the given data 
reference are issued at different times, so that a single event is unlikely to cause 
all of the multiple prefetch instructions for the given data reference to be dropped
by the computer system. 

30. The apparatus of claim 28, wherein the insertion mechanism is 
configured to issue each of the multiple prefetch instructions for the given data 
reference in a different loop iteration. 

31. The apparatus of claim 28, wherein the apparatus resides within a
compiler. 

32. A method for generating code to perform anticipatory prefetching 
for data references, comprising: 

receiving code to be executed on a computer system; 
analyzing the code to identify data references to be prefetched; and 
inserting prefetch instructions into the code in advance of the identified 
data references; 

wherein inserting the prefetch instructions involves, 

identifying a location in the code where a prefetch address 
for a given prefetch instruction is calculated, and 

inserting the given prefetch instruction as far ahead of a 
corresponding data reference operation as possible, but not before 
the location where the prefetch address is calculated. 

33. The method of claim 32, wherein inserting the given prefetch 
instruction can involve inserting the given prefetch instruction into a preceding 
block in the code. 

34. The method of claim 33, wherein inserting the given prefetch
instruction involves: 

tracing execution of the code to produce an execution trace; 

using the execution trace to identify a preceding block in which the 
prefetch address is calculated; and 

inserting the given prefetch instruction into the preceding block after the 
location where the prefetch address is calculated. 
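
The placement rule of claims 32 through 34 can be sketched on a straight-line execution trace. The sketch assumes a trace is a list of `(opcode, destination, sources)` tuples and that the prefetch address lives in a single register. This representation and the name `place_prefetch` are hypothetical.

```python
from typing import List, Optional, Tuple

Instruction = Tuple[str, Optional[str], Tuple[str, ...]]


def place_prefetch(trace: List[Instruction],
                   load_index: int,
                   address_reg: str) -> List[Instruction]:
    """Insert a prefetch as early as possible, but after the address is computed."""
    insert_at = 0  # if the address is live on entry, hoist to the top of the trace
    # Walk backwards from the load to the most recent definition of the address.
    for idx in range(load_index - 1, -1, -1):
        if trace[idx][1] == address_reg:
            insert_at = idx + 1
            break
    new_trace = list(trace)
    new_trace.insert(insert_at, ("prefetch", None, (address_reg,)))
    return new_trace


trace: List[Instruction] = [
    ("add", "r1", ("r0", "r9")),   # prefetch address is calculated here
    ("mul", "r2", ("r3", "r4")),
    ("sub", "r5", ("r2", "r6")),
    ("load", "r7", ("r1",)),       # the data reference to be prefetched
]

# The prefetch lands directly after the add, three instructions ahead of the load.
result = place_prefetch(trace, load_index=3, address_reg="r1")
```

Hoisting across block boundaries, as in claim 33, would correspond to the address definition and the load sitting in different blocks of the same trace.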



35. The method of claim 32, wherein analyzing the code involves: 
identifying loop bodies within the code; and 

identifying data references to be prefetched from within the loop bodies. 

36. The method of claim 35, wherein analyzing the code to identify 
data references to be prefetched involves examining a pattern of data references 
over multiple loop iterations. 



37. The method of claim 32, wherein analyzing the code involves 
analyzing the code within a compiler. 



38. A computer-readable storage medium storing instructions that 
when executed by a computer cause the computer to perform a method for 
generating code to perform anticipatory prefetching for data references, the 
method comprising: 

receiving code to be executed on a computer system; 

analyzing the code to identify data references to be prefetched; and 

inserting prefetch instructions into the code in advance of the identified 
data references; 

wherein inserting the prefetch instructions involves, 
identifying a location in the code where a prefetch address
for a given prefetch instruction is calculated, and 

inserting the given prefetch instruction as far ahead of a
corresponding data reference operation as possible, but not before
the location where the prefetch address is calculated.

39. The computer-readable storage medium of claim 38, wherein 
inserting the given prefetch instruction can involve inserting the given prefetch 
instruction into a preceding block in the code. 

40. The computer-readable storage medium of claim 38, wherein 
inserting the given prefetch instruction involves: 

tracing execution of the code to produce an execution trace; 

using the execution trace to identify a preceding block in which the 
prefetch address is calculated; and 

inserting the given prefetch instruction into the preceding block after the 
location where the prefetch address is calculated. 

41. The computer-readable storage medium of claim 38, wherein
analyzing the code involves analyzing the code within a compiler. 

42. An apparatus that generates code to perform anticipatory 
prefetching for data references, comprising: 

a receiving mechanism that is configured to receive code to be executed 
on a computer system; 

an analysis mechanism that is configured to analyze the code to identify 
data references to be prefetched; and 

an insertion mechanism that is configured to insert prefetch instructions
into the code in advance of the identified data references; 
wherein the insertion mechanism is configured to, 

identify a location in the code where a prefetch address for
a given prefetch instruction is calculated, and to

insert the given prefetch instruction as far ahead of a 
corresponding data reference operation as possible, but not before 
the location where the prefetch address is calculated. 

43. The apparatus of claim 42, wherein the insertion mechanism is
configured to insert the given prefetch instruction into a preceding block in the 
code. 

44. The apparatus of claim 43, wherein the insertion mechanism is 
configured to: 

trace execution of the code to produce an execution trace; 
use the execution trace to identify a preceding block in which the prefetch 
address is calculated; and to 

insert the given prefetch instruction into the preceding block after the 
location where the prefetch address is calculated. 

45. The apparatus of claim 42, wherein the apparatus resides within a 
compiler. 